
    Segmentation and 3D positioning of mobile robots in intelligent spaces using networks of fixed cameras

    This doctoral thesis contributes to the segmentation, identification and 3D positioning of multiple mobile robots. It relies on a set of calibrated, mutually synchronized cameras placed at fixed positions in the space where the robots move (the intelligent space). No a priori knowledge of the robots' structure is assumed, nor are artificial markers mounted on them. For both motion segmentation and 3D position estimation of the mobile robots, the thesis proposes a solution based on minimizing an objective function that incorporates information from all the cameras available in the intelligent space. This objective function depends on three groups of variables: the contours that define the segmentation on the image plane, the 3D motion parameters (linear and angular velocity components in the global reference frame), and the depth of each scene point with respect to the image plane. Because the objective function depends on three groups of variables, it is minimized with an iterative greedy algorithm that alternates between stages: in each stage, two of the variable groups are assumed known and the equation is solved for the remaining one. Before the minimization, both the curves defining the segmentation contours and the depth of every point belonging to the robots are initialized, and the number of robots present in the scene must be estimated. Since the cameras occupy fixed positions in the intelligent space, the curves are initialized by comparing each input image against a previously learned background model; Generalized Principal Component Analysis (GPCA) is used both to build the background model and to compare input images against it. For depth, Visual Hull 3D (VH3D) is used to fuse the information from all available cameras, yielding an approximate 3D contour of the mobile robots; this reconstruction provides a good approximation of the initial depth of every point belonging to the robots. In addition, an extended version of the k-means clustering technique yields an estimate of the number of robots in the scene. After motion segmentation and 3D position estimation of all moving objects in the scene, the mobile robots are identified. This identification is possible because the robots are agents controlled by the intelligent space, so the measurements of the odometric sensors on board each robot are available. For tracking, an extended particle filter with classification process (XPFCP) is proposed. This estimator was chosen because, owing to its multimodal nature, it can track a variable number of elements (mobile robots) with a single estimator, without enlarging the state vector. The output of the XPFCP is a good estimate of the robots' positions at the next time instant, so this information is fed back to the variable-initialization stage, reducing the processing time that stage consumes.
The different solutions proposed throughout the thesis have been validated experimentally on several image sequences (featuring different robots, people, various objects, lighting changes, etc.) acquired in the intelligent space of the Electronics Department of the Universidad de Alcalá (ISPACE-UAH).
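
    A minimal sketch of the alternating ("greedy") minimization scheme described above: each stage fixes two of the three variable groups (contours C, 3D motion V, depths Z) and updates the third. The quadratic toy energy and the gradient-step updates are illustrative placeholders, not the actual energy terms of the thesis.

```python
import numpy as np

def numerical_grad(f, x, eps=1e-5):
    """Forward-difference gradient; good enough for a toy example."""
    g = np.zeros_like(x)
    fx = f(x)
    for i in range(x.size):
        xp = x.copy()
        xp.ravel()[i] += eps
        g.ravel()[i] = (f(xp) - fx) / eps
    return g

def minimize_energy(C, V, Z, energy, steps=100, lr=0.1):
    """Alternately update each variable group while the other two stay fixed."""
    for _ in range(steps):
        C = C - lr * numerical_grad(lambda c: energy(c, V, Z), C)  # contours
        V = V - lr * numerical_grad(lambda v: energy(C, v, Z), V)  # 3D motion
        Z = Z - lr * numerical_grad(lambda z: energy(C, V, z), Z)  # depths
    return C, V, Z

# Toy usage with a separable quadratic energy (placeholder objective).
E = lambda c, v, z: float(((c - 1) ** 2).sum() + ((v - 2) ** 2).sum()
                          + ((z - 3) ** 2).sum())
C, V, Z = minimize_energy(np.zeros(2), np.zeros(3), np.zeros(4), E)
```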

    Real-time human action recognition using raw depth video-based recurrent neural networks

    This work proposes and compares two different approaches for real-time human action recognition (HAR) from raw depth video sequences. Both proposals are based on the convolutional long short-term memory unit (ConvLSTM), with differences in the architecture and in how long-term dependencies are learned. The former uses a video-length-adaptive input data generator (stateless), whereas the latter exploits the stateful ability of general recurrent neural networks, applied here to the particular case of HAR. The stateful property allows the model to accumulate discriminative patterns from previous frames without compromising computer memory. Furthermore, since the proposal uses only depth information, HAR is carried out while preserving the privacy of people in the scene, since their identities cannot be recognized. Both neural networks have been trained and tested using the large-scale NTU RGB+D dataset. Experimental results show that the proposed models achieve competitive recognition accuracy at a lower computational cost than state-of-the-art methods, and prove that, in the particular case of videos, the rarely used stateful mode of recurrent neural networks significantly improves the accuracy obtained with the standard mode. The recognition accuracies obtained are 75.26% (CS) and 75.45% (CV) for the stateless model, with an average time consumption of 0.21 s per video, and 80.43% (CS) and 79.91% (CV) with 0.89 s for the stateful one.
    Funding: Agencia Estatal de Investigación; Universidad de Alcalá
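
    As a hedged illustration of the stateful idea described above, the Keras sketch below feeds a long video to a ConvLSTM as consecutive fixed-length chunks while the layer carries its cell state across calls; the layer sizes, input resolution and 60-class output (as in NTU RGB+D) are assumptions, not the paper's exact architecture.

```python
import numpy as np
from tensorflow.keras import layers, models

# stateful=True keeps the ConvLSTM cell state across successive batches, so a
# long video can be processed chunk by chunk without storing all its frames.
convlstm = layers.ConvLSTM2D(32, 3, padding="same", stateful=True)

model = models.Sequential([
    # (frames, H, W, C); stateful mode requires a fixed batch size.
    layers.Input(shape=(8, 64, 64, 1), batch_size=1),
    convlstm,
    layers.GlobalAveragePooling2D(),
    layers.Dense(60, activation="softmax"),  # e.g. the 60 NTU RGB+D classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# A 32-frame video processed as four 8-frame chunks; state accumulates.
for chunk in np.split(np.zeros((1, 32, 64, 64, 1), "float32"), 4, axis=1):
    probs = model.predict(chunk, verbose=0)
convlstm.reset_states()  # clear the accumulated state before the next video
```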

    Fast heuristic method to detect people in frontal depth images

    This paper presents a new method for detecting people using only depth images captured by a camera in a frontal position. The approach first detects all the objects present in the scene and determines their average depth (distance to the camera). Next, for each object, a 3D region of interest (ROI) around it is processed to determine whether the object's characteristics correspond to the biometric characteristics of a human head. Results are presented for three public datasets captured by three depth sensors with different spatial resolutions and different operating principles (structured light, active stereo vision and time of flight). These results demonstrate that the method can run in real time on a low-cost CPU platform with high accuracy: processing times are under 1 ms per frame at 512 × 424 image resolution with a precision of 99.26%, and under 4 ms per frame at 1280 × 720 image resolution with a precision of 99.77%.
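
    A rough sketch of the heuristic pipeline described above, under assumed camera parameters and head dimensions: connected components stand in for object detection, and each object's mean depth converts its pixel width to meters for a plausibility test against human-head size. The thresholds and the pinhole back-projection are illustrative, not the paper's exact criteria.

```python
import numpy as np
from scipy import ndimage

HEAD_WIDTH_M = (0.12, 0.22)  # assumed plausible head-width range in meters

def detect_heads(depth_m, fx=525.0, max_range=6.0):
    """depth_m: HxW depth image in meters; fx: focal length in pixels."""
    valid = (depth_m > 0) & (depth_m < max_range)
    labels, n_objects = ndimage.label(valid)   # crude object segmentation
    heads = []
    for obj in range(1, n_objects + 1):
        mask = labels == obj
        z = depth_m[mask].mean()               # average distance to camera
        ys, xs = np.nonzero(mask)
        width_px = xs.max() - xs.min() + 1
        width_m = width_px * z / fx            # pinhole model: metric width
        if HEAD_WIDTH_M[0] <= width_m <= HEAD_WIDTH_M[1]:
            heads.append((int(ys.mean()), int(xs.mean()), float(z)))
    return heads
```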

    People re-identification using depth and intensity information from an overhead sensor

    This work presents a new people re-identification method using depth and intensity images, both captured with a single static camera located in an overhead position. The proposed solution arises from the need, in many application areas, to carry out identification and re-identification processes, for example to determine how long people remain in a certain space, while fulfilling the requirement of preserving people's privacy. The work is novel compared with previous solutions, since top-view depth and intensity images provide enough information to identify and re-identify people while maintaining their privacy and reducing occlusions. The identification and re-identification procedure uses only three frames of intensity and depth: the first is obtained when the person enters the scene (frontal view), the second when they are in the central area of the scene (overhead view), and the third when they leave the scene (back view). The implemented method uses only information from the head and shoulders of people seen from these three perspectives. From these views, three feature vectors are obtained in a simple way, two related to depth information and one related to intensity data, which increases the robustness of the method against lighting changes. The proposal has been evaluated on two different datasets and compared with another state-of-the-art proposal. The results show a 96.7% success rate in re-identification with sensors that use different operating principles, all of them providing depth and intensity information. Furthermore, the implemented method can work in real time on a PC, without using a GPU.
    Funding: Ministerio de Economía y Competitividad; Agencia Estatal de Investigación; Universidad de Alcalá
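
    A toy sketch of the three-view matching step: each person is summarized by three feature vectors (frontal, overhead and back views of the head and shoulders), and re-identification returns the nearest stored descriptor. The histogram features below are an assumption made for illustration; the paper's actual feature vectors are defined differently.

```python
import numpy as np

def person_descriptor(front_depth, overhead_depth, back_intensity, bins=32):
    """Concatenate normalized histograms of head-and-shoulders crops
    (inputs assumed scaled to [0, 1])."""
    feats = []
    for view in (front_depth, overhead_depth, back_intensity):
        h, _ = np.histogram(view, bins=bins, range=(0.0, 1.0), density=True)
        feats.append(h / (np.linalg.norm(h) + 1e-9))
    return np.concatenate(feats)

def reidentify(query, gallery):
    """Index of the gallery descriptor closest to the query (Euclidean)."""
    return int(np.argmin([np.linalg.norm(query - g) for g in gallery]))
```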

    Self-Triggered Formation Control of Nonholonomic Robots

    In this paper, we report the design of an aperiodic remote formation controller applied to nonholonomic robots tracking nonlinear trajectories using an external positioning sensor network. Our main objective is to reduce wireless communication with the external sensors and robots while guaranteeing formation stability. Unlike most previous work in the field of aperiodic control, we design a self-triggered controller that updates the control signal only according to the variation of a Lyapunov function, without taking the measurement error into account. The controller is responsible for scheduling measurement requests to the sensor network and for computing and sending control signals to the robots. We design two triggering mechanisms: a centralized one, which takes the formation state into account, and a decentralized one, which considers the individual state of each unit. We present a statistical analysis of simulation results showing that our control solution significantly reduces the need for communication compared with periodic implementations, while preserving the desired tracking performance. To validate the proposal, we also perform experimental tests with robots remotely controlled by a mini PC through an IEEE 802.11g wireless network, in which the robots' poses are detected by a set of camera sensors connected to the same wireless network.
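
    A schematic sketch of the self-triggered idea: between events the plant evolves with the last computed control signal, and a new measurement/control update is requested only when the Lyapunov function leaves an assumed reference decay. The dynamics, the proportional law and the decay rate are placeholders, not the paper's formulation.

```python
import numpy as np

def lyapunov(e):
    return float(e @ e)  # placeholder: squared norm of the tracking error

def self_triggered_loop(x, x_ref, control_law, dynamics,
                        dt=0.01, T=5.0, alpha=1.0):
    t, t_k = 0.0, 0.0
    u = control_law(x_ref - x)          # control held between trigger events
    V_k, updates = lyapunov(x_ref - x), 1
    while t < T:
        x = x + dt * dynamics(x, u)     # integrate the plant between events
        t += dt
        V = lyapunov(x_ref - x)
        # Trigger: recompute u when V exceeds the reference decay curve.
        if V > V_k * np.exp(-alpha * (t - t_k)):
            u = control_law(x_ref - x)
            V_k, t_k, updates = V, t, updates + 1
    return x, updates

# Toy usage: integrator dynamics driven to a reference with few updates.
x, n_updates = self_triggered_loop(np.zeros(2), np.ones(2),
                                   control_law=lambda e: 2.0 * e,
                                   dynamics=lambda x, u: u)
```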

    A new framework for deep learning video based Human Action Recognition on the edge

    Nowadays, video surveillance systems are commonly found in most public and private spaces. These systems typically consist of a network of cameras that feed into a central node. However, the processing aspect is evolving towards distributed approaches that leverage edge computing. Such distributed systems can effectively address the detection of people or events at each individual node. Most of them rely on deep-learning and segmentation algorithms, which enable high performance but usually at a significant computational cost, hindering real-time execution. This paper presents an approach for people detection and action recognition in the wild, optimized for running on the edge, that is able to work in real time on an embedded platform. Human Action Recognition (HAR) is performed using a Recurrent Neural Network (RNN), specifically a Long Short-Term Memory (LSTM) network. The input to the LSTM is an ad hoc, lightweight feature vector obtained from the bounding box of each person detected in the video surveillance image. The resulting system is highly portable and easily scalable, providing a powerful tool for real-world video surveillance applications (in-the-wild, real-time action recognition). The proposal has been exhaustively evaluated and compared against other state-of-the-art (SOTA) proposals on five datasets: four widely used ones (KTH, Weizmann, WVU, IXMAS) and a novel one (GBA) recorded in the wild, which includes several people performing different actions simultaneously. The results validate the proposal, since it achieves SOTA accuracy in a much more complicated, realistic video surveillance scenario while using lightweight embedded hardware.
    Funding: European Commission; Agencia Estatal de Investigación; Universidad de Alcalá
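
    A minimal sketch of the pipeline described above: a lightweight per-frame feature vector is derived from each detected person's bounding box, and a short sequence of such vectors feeds an LSTM classifier. The feature definition, sequence length and number of actions are assumptions for illustration.

```python
import numpy as np
from tensorflow.keras import layers, models

def bbox_features(x, y, w, h, img_w, img_h):
    """Normalized center, size and aspect ratio of one detection's box."""
    return np.array([(x + w / 2) / img_w, (y + h / 2) / img_h,
                     w / img_w, h / img_h, w / max(h, 1)], dtype="float32")

T, F, n_actions = 30, 5, 6         # frames per clip, features, action classes
model = models.Sequential([
    layers.Input(shape=(T, F)),
    layers.LSTM(64),               # temporal aggregation of the box features
    layers.Dense(n_actions, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```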

    Remote control of a robotic unit: a case study for control engineering formation

    Hands-on experimentation has widely demonstrated its efficacy in engineering training, especially in control education, since experimentation with computer-aided control system design (CACSD) tools is essential for future engineers. In this context, this article describes a case study for control engineering education, based on a new lab practice on linear and angular velocity control of a commercial P3-DX robot platform, used to teach industrial control. The lab covers all the stages involved in the design of a real control system, from plant identification based on an open-loop test to real experimentation with the designed control system. The proposed lab practice has a twofold objective: first, its interdisciplinary approach allows students to put into practice skills from other subjects in the curriculum, facilitating the integration of knowledge; second, working with a complex and realistic plant increases students' motivation. The proposal has been evaluated through the students' grades as well as the perception of both students and instructors, and the results confirm its benefits.
    Funding: Universidad de Alcalá
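
    The sketch below mirrors the described lab workflow at a high level, under illustrative assumptions: fit a first-order model (gain K, time constant tau) to an open-loop step response of the robot's velocity loop, then tune a PI controller with an IMC-style lambda rule. Neither the model structure nor the tuning rule is claimed to be the article's exact procedure.

```python
import numpy as np

def fit_first_order(t, y, u_step):
    """Estimate gain K and time constant tau from an open-loop step response."""
    K = y[-1] / u_step                         # steady-state gain
    tau = t[np.argmax(y >= 0.632 * y[-1])]     # time to 63.2% of final value
    return K, tau

def pi_gains(K, tau, lam=None):
    """IMC-style tuning for K/(tau*s + 1): Kp = tau/(K*lam), Ti = tau."""
    lam = lam or tau                           # desired closed-loop constant
    return tau / (K * lam), tau

# Synthetic open-loop test: unit step into a first-order velocity plant.
t = np.linspace(0, 5, 500)
y = 0.8 * (1 - np.exp(-t / 0.6))
Kp, Ti = pi_gains(*fit_first_order(t, y, u_step=1.0))
print(f"Kp = {Kp:.2f}, Ti = {Ti:.2f} s")
```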

    Towards dense people detection with deep learning and depth images

    This paper describes a novel DNN-based system, named PD3net, that detects multiple people from a single depth image in real time. The proposed neural network processes a depth image and outputs a likelihood map in image coordinates, where each detection corresponds to a Gaussian-shaped local distribution centered at each person's head. This likelihood map encodes both the number of detected people and their position in the image, from which the 3D position can be computed. The proposed DNN includes spatially separable convolutions to increase performance and runs in real time on low-budget GPUs. We use synthetic data to initially train the network, followed by fine-tuning with a small amount of real data. This allows the network to be adapted to different scenarios without needing large, manually labeled image datasets. As a result, the people detection system presented in this paper has numerous potential applications in different fields, such as capacity control, automatic video surveillance, people or group behavior analysis, healthcare, or monitoring and assistance of elderly people in ambient assisted living environments. In addition, the use of depth information does not allow recognizing the identity of people in the scene, thus enabling their detection while preserving their privacy. The proposed DNN has been experimentally evaluated and compared with other state-of-the-art approaches, including both classical and DNN-based solutions, under a wide range of experimental conditions. The results allow us to conclude that the proposed architecture and training strategy are effective, and that the network generalizes to scenes different from those used during training. We also demonstrate that our proposal outperforms existing methods and can accurately detect people in scenes with significant occlusions.
    Funding: Ministerio de Economía y Competitividad; Universidad de Alcalá; Agencia Estatal de Investigación
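
    A small sketch of the likelihood-map encoding described above: every annotated head position becomes a Gaussian-shaped peak in the map the network regresses, so counting peaks yields both the number of people and their image positions. The map size and sigma are illustrative choices.

```python
import numpy as np

def likelihood_map(head_centers, shape=(120, 160), sigma=4.0):
    """Build an HxW map with one 2D Gaussian per (row, col) head position."""
    H, W = shape
    ys, xs = np.mgrid[0:H, 0:W]
    lmap = np.zeros(shape, dtype="float32")
    for r, c in head_centers:
        g = np.exp(-((ys - r) ** 2 + (xs - c) ** 2) / (2 * sigma ** 2))
        lmap = np.maximum(lmap, g)  # max keeps nearby peaks from merging
    return lmap

# Two people: two clear local maxima at the annotated head positions.
target = likelihood_map([(40, 60), (80, 120)])
```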

    3DFCNN: real-time action recognition using 3D deep neural networks with raw depth information

    This work describes an end-to-end approach for real-time human action recognition from raw depth image sequences. The proposal is based on a 3D fully convolutional neural network, named 3DFCNN, which automatically encodes spatio-temporal patterns from raw depth sequences. The described 3D-CNN allows action classification from the spatially and temporally encoded information of depth sequences. The use of depth data ensures that action recognition is carried out while protecting people's privacy, since their identities cannot be recognized from these data. The proposed 3DFCNN has been optimized to reach good accuracy while working in real time. It has then been evaluated and compared with other state-of-the-art systems on three widely used public datasets with different characteristics, demonstrating that 3DFCNN outperforms all the non-DNN-based state-of-the-art methods, with a maximum accuracy of 83.6%, and obtains results comparable to the DNN-based approaches while maintaining a much lower computational cost of 1.09 seconds, which significantly increases its applicability in real-world environments.
    Funding: Agencia Estatal de Investigación; Universidad de Alcalá
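
    An illustrative Keras sketch in the spirit of a 3D fully convolutional classifier over raw depth clips; the clip size, layer widths and 60-class head are assumptions, not the exact 3DFCNN architecture.

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(16, 64, 64, 1)),   # (frames, H, W, channels)
    layers.Conv3D(16, 3, padding="same", activation="relu"),
    layers.MaxPooling3D(2),                # downsample time and space jointly
    layers.Conv3D(32, 3, padding="same", activation="relu"),
    layers.GlobalAveragePooling3D(),       # global pooling instead of flatten
    layers.Dense(60, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```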